Adaptive Averaging in Accelerated Descent Dynamics

Authors

  • Walid Krichene
  • Alexandre M. Bayen
  • Peter L. Bartlett
Abstract

We study accelerated descent dynamics for constrained convex optimization. This dynamics can be described naturally as a coupling of a dual variable accumulating gradients at a given rate η(t), and a primal variable obtained as the weighted average of the mirrored dual trajectory, with weights w(t). Using a Lyapunov argument, we give sufficient conditions on η and w to achieve a desired convergence rate. As an example, we show that the replicator dynamics (an example of mirror descent on the simplex) can be accelerated using a simple averaging scheme. We then propose an adaptive averaging heuristic which adaptively computes the weights to speed up the decrease of the Lyapunov function. We provide guarantees on adaptive averaging in continuous-time, and give numerical experiments in discrete-time to compare it with existing heuristics, such as adaptive restarting. The experiments indicate that adaptive averaging performs at least as well as adaptive restarting, with significant improvements in some cases.
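
The abstract states the coupling only in words. As a reconstruction of that description (the mirror map ψ*, the initial dual point z_0, the initial primal point x_0, and the initial weight w_0 are notation assumed here, not quoted from the paper), the continuous-time dynamics can be written as

    Z(t) = z_0 - \int_0^t \eta(s)\,\nabla f(X(s))\,\mathrm{d}s,
    \qquad
    X(t) = \frac{w_0\, x_0 + \int_0^t w(s)\,\nabla\psi^*(Z(s))\,\mathrm{d}s}{w_0 + \int_0^t w(s)\,\mathrm{d}s},

so the dual variable Z accumulates gradients at rate η(t), and the primal variable X is the w-weighted running average of the mirrored dual trajectory ∇ψ*(Z).

In discrete time, a minimal sketch of this idea might look as follows. It uses the Euclidean mirror map (so the mirrored dual point is simply z), made-up parameter names, and a crude stand-in for the paper's adaptive averaging heuristic: the averaging weight is boosted only when the mirrored dual point lies in a descent direction of f. This is an illustration, not the authors' exact algorithm.

    import numpy as np

    def amd_adaptive_averaging(grad_f, x0, step=0.01, r=3.0, iters=500):
        """Illustrative accelerated mirror descent with adaptive averaging.

        Euclidean mirror map (grad psi* = identity), so the mirrored dual
        point is simply z.  Sketch only; parameter names are assumptions.
        """
        x = np.asarray(x0, dtype=float)   # primal iterate (weighted average)
        z = x.copy()                      # dual variable accumulating scaled gradients
        for k in range(1, iters + 1):
            lam = r / (r + k)             # default weights of the usual accelerated scheme
            # adaptive surrogate: if moving toward the mirrored dual point
            # decreases f to first order, put extra weight on it
            if np.dot(grad_f(x), z - x) < 0.0:
                lam = min(1.0, 2.0 * lam)
            x = lam * z + (1.0 - lam) * x         # primal = weighted-average step
            z = z - (step * k / r) * grad_f(x)    # dual accumulates gradients at rate ~ k
        return x

    # usage: minimize f(x) = 0.5 * ||x - 1||^2
    f = lambda x: 0.5 * np.sum((x - 1.0) ** 2)
    grad_f = lambda x: x - 1.0
    print(f(amd_adaptive_averaging(grad_f, np.zeros(5))))

The fixed schedule lam = r/(r + k) corresponds to the standard (non-adaptive) averaging weights; the adaptive branch only changes how fast the primal iterate is pulled toward the mirrored dual trajectory.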

Similar articles

Acceleration and Averaging in Stochastic Descent Dynamics

[1] A. Nemirovski and D. Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience Series in Discrete Mathematics. Wiley, 1983. [2] W. Krichene, A. Bayen, and P. Bartlett. Accelerated Mirror Descent in Continuous and Discrete Time. NIPS 2015. [3] W. Su, S. Boyd, and E. Candès. A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. NI...

Full text

Randomized Block Subgradient Methods for Convex Nonsmooth and Stochastic Optimization

Block coordinate descent methods and stochastic subgradient methods have been extensively studied in optimization and machine learning. By combining randomized block sampling with stochastic subgradient methods based on dual averaging ([22, 36]), we present stochastic block dual averaging (SBDA)—a novel class of block subgradient methods for convex nonsmooth and stochastic optimization. SBDA re...

Full text

Accelerated Mirror Descent in Continuous and Discrete Time

We study accelerated mirror descent dynamics in continuous and discrete time. Combining the original continuous-time motivation of mirror descent with a recent ODE interpretation of Nesterov’s accelerated method, we propose a family of continuous-time descent dynamics for convex functions with Lipschitz gradients, such that the solution trajectories converge to the optimum at a O(1/t2) rate. We...

Full text

Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks

Effective training of deep neural networks suffers from two main issues. The first is that the parameter spaces of these models exhibit pathological curvature. Recent methods address this problem by using adaptive preconditioning for Stochastic Gradient Descent (SGD). These methods improve convergence by adapting to the local geometry of parameter space. A second issue is overfitting, which is ...

Full text

A Survey of Algorithms and Analysis for Adaptive Online Learning

We present tools for the analysis of Follow-The-Regularized-Leader (FTRL), Dual Averaging, and Mirror Descent algorithms when the regularizer (equivalently, prox-function or learning rate schedule) is chosen adaptively based on the data. Adaptivity can be used to prove regret bounds that hold on every round, and also allows for data-dependent regret bounds as in AdaGrad-style algorithms (e.g., O...

Full text

Journal title:

Volume:   Issue:

Pages:

Publication date: 2016